Abstract

In this study, two-state Markov switching multinomial logit models are proposed 
for statistical modeling of accident injury severities. These models assume Markov 
switching in time between two unobserved states of roadway safety. The states are 
distinct, in the sense that in different states accident severity outcomes are generated 
by separate multinomial logit processes. To demonstrate the applicability of the 
approach presented herein, two-state Markov switching multinomial logit models 
are estimated for severity outcomes of accidents occurring on Indiana roads over 
a four-year time interval. Bayesian inference methods and Markov Chain Monte 
Carlo (MCMC) simulations are used for model estimation. The estimated Markov 
switching models result in a superior statistical fit relative to the standard (single- 
state) multinomial logit models. It is found that the more frequent state of roadway 
safety is correlated with better weather conditions. The less frequent state is found 
to be correlated with adverse weather conditions. 

Key words: Accident injury severity; multinomial logit; Markov switching; 
Bayesian; MCMC 



1 Introduction 



Vehicle accidents result in property damage, injuries and loss of life.
Research on predicting accident severity is therefore clearly important.
A large number of past studies have focused on modeling accident severity
outcomes. Common modeling approaches include multinomial logit models,
nested logit models, mixed logit models and ordered probit models
(O'Donnell and Connor, 1996; Shankar and Mannering, 1996; Shankar et al., 1996;
Duncan et al., 1998; Chang and Mannering, 1999; Carson and Mannering, 2001;
Khattak, 2001; Khattak et al., 2002; Kockelman and Kweon, 2002;
Lee and Mannering, 2002; Abdel-Aty, 2003; Kweon and Kockelman, 2003;
Ulfarsson and Mannering, 2004; Yamamoto and Shankar, 2004;
Khorashadi et al., 2005; Eluru and Bhat, 2007; Savolainen and Mannering, 2007;
Milton et al., 2008). All these models involve nonlinear regression of the
observed accident injury severity outcomes on various accident characteristics
and related factors (such as roadway and driver characteristics, environmental
factors, etc.).



In our earlier paper, Malyshkina et al. (2008), which we will refer to as
Paper I, we presented two-state Markov switching count data models of accident
frequencies. In this study, which is a continuation of our work on Markov
switching models, we present two-state Markov switching multinomial logit
models for predicting accident severity outcomes. These models assume that
there are two unobserved states of roadway safety, that roadway entities
(roadway segments) can switch between these states over time, and that the
switching process is Markovian. The two states are intended to account for
possible heterogeneity effects in roadway safety, which may be caused by
various unpredictable, unidentified, unobservable risk factors that influence
roadway safety. Because the risk factors can interact and change, roadway
entities can switch between the two states over time. Two-state Markov
switching multinomial logit models assume separate multinomial logit processes
for accident severity data generation in the two states and, therefore, allow
a researcher to study the heterogeneity effects in roadway safety.



2 Model specification 



Markov switching models are parametric and can be fully specified by a
likelihood function $f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M})$, which is
the conditional probability distribution of the vector of all observations
$\mathbf{Y}$, given the vector of all parameters $\boldsymbol{\Theta}$ of
model $\mathcal{M}$. First, let us consider $\mathbf{Y}$. Let $N_t$ be the
number of accidents observed during time period $t$, where
$t = 1, 2, \ldots, T$ and $T$ is the total number of time periods. Let there
be $I$ discrete outcomes observed for accident severity (for example, $I = 3$
and these outcomes are fatality, injury and property damage only). Let us
introduce accident severity outcome dummies $\delta_{t,n}^{(i)}$ that are
equal to unity if the $i^{\rm th}$ severity outcome is observed in the
$n^{\rm th}$ accident that occurs during time period $t$, and to zero
otherwise. Here $i = 1, 2, \ldots, I$, $n = 1, 2, \ldots, N_t$ and
$t = 1, 2, \ldots, T$. Then, our observations are the accident severity
outcomes, and the vector of all observations
$\mathbf{Y} = \{\delta_{t,n}^{(i)}\}$ includes all outcomes observed in all
accidents that occur during all time periods. Second, let us consider the
model specification variable $\mathcal{M}$. It is
$\mathcal{M} = \{M, \mathbf{X}_{t,n}\}$ and includes the model's name $M$
(for example, $M$ = "multinomial logit") and the vector $\mathbf{X}_{t,n}$ of
all accident characteristic variables (weather and environment conditions,
vehicle and driver characteristics, roadway and pavement properties, and so
on).

To define the likelihood function, we first introduce an unobserved (latent)
state variable $s_t$, which determines the state of all roadway entities during
time period $t$. At each $t$, the state variable $s_t$ can assume only two
values: $s_t = 0$ corresponds to one state and $s_t = 1$ corresponds to the
other state ($t = 1, 2, \ldots, T$). The state variable $s_t$ is assumed to
follow a stationary two-state Markov chain process in time,^1 which can be
specified by time-independent transition probabilities as

$$
P(s_{t+1} = 1 | s_t = 0) = p_{0\to1}, \qquad
P(s_{t+1} = 0 | s_t = 1) = p_{1\to0}.
\tag{1}
$$

Here, for example, $P(s_{t+1} = 1 | s_t = 0)$ is the conditional probability of
$s_{t+1} = 1$ at time $t + 1$, given that $s_t = 0$ at time $t$. Transition
probabilities $p_{0\to1}$ and $p_{1\to0}$ are unknown parameters to be
estimated from accident severity data. The stationary unconditional
probabilities of states $s_t = 0$ and $s_t = 1$ are
$\bar{p}_0 = p_{1\to0}/(p_{0\to1} + p_{1\to0})$ and
$\bar{p}_1 = p_{0\to1}/(p_{0\to1} + p_{1\to0})$ respectively.^2 Without loss of
generality, we assume that (on average) state $s_t = 0$ occurs more or equally
frequently than state $s_t = 1$. Therefore, $\bar{p}_0 \ge \bar{p}_1$, and we
obtain the restriction^3

$$
p_{0\to1} \le p_{1\to0}.
\tag{2}
$$

We refer to states $s_t = 0$ and $s_t = 1$ as "more frequent" and "less
frequent" states respectively.

Next, a two-state Markov switching multinomial logit (MSML) model assumes
multinomial logit (ML) data-generating processes for accident severity in each
of the two states. With this, the probability of the $i^{\rm th}$ severity
outcome observed in the $n^{\rm th}$ accident during time period $t$ is

$$
P_{t,n}^{(i)} =
\begin{cases}
\dfrac{\exp(\boldsymbol{\beta}'_{(0),i}\mathbf{X}_{t,n})}
      {\sum_{j=1}^{I}\exp(\boldsymbol{\beta}'_{(0),j}\mathbf{X}_{t,n})}
  & \text{if } s_t = 0, \\[2ex]
\dfrac{\exp(\boldsymbol{\beta}'_{(1),i}\mathbf{X}_{t,n})}
      {\sum_{j=1}^{I}\exp(\boldsymbol{\beta}'_{(1),j}\mathbf{X}_{t,n})}
  & \text{if } s_t = 1,
\end{cases}
\tag{3}
$$

where $i = 1, 2, \ldots, I$, $n = 1, 2, \ldots, N_t$ and $t = 1, 2, \ldots, T$.

Here prime means transpose (so $\boldsymbol{\beta}'_{(0),i}$ is the transpose
of $\boldsymbol{\beta}_{(0),i}$). Parameter vectors
$\boldsymbol{\beta}_{(0),i}$ and $\boldsymbol{\beta}_{(1),i}$ are unknown
estimable parameters of the two standard multinomial logit probability mass
functions (Washington et al., 2003) in the two states, $s_t = 0$ and
$s_t = 1$ respectively. We set the first component of $\mathbf{X}_{t,n}$ to
unity, and, therefore, the first components of vectors
$\boldsymbol{\beta}_{(0),i}$ and $\boldsymbol{\beta}_{(1),i}$ are the
intercepts in the two states. In addition, without loss of generality, we set
all $\beta$-parameters for the last severity outcome to zero,^4
$\boldsymbol{\beta}_{(0),I} = \boldsymbol{\beta}_{(1),I} = \mathbf{0}$.

^1 The Markov property means that the probability distribution of $s_{t+1}$
depends only on the value $s_t$ at time $t$, but not on the previous history
$s_{t-1}, s_{t-2}, \ldots$. Stationarity of $\{s_t\}$ is in the statistical
sense.

^2 These can be found from the stationarity conditions
$\bar{p}_0 = (1 - p_{0\to1})\bar{p}_0 + p_{1\to0}\bar{p}_1$,
$\bar{p}_1 = p_{0\to1}\bar{p}_0 + (1 - p_{1\to0})\bar{p}_1$ and
$\bar{p}_0 + \bar{p}_1 = 1$.

^3 Without any loss of generality, restriction (2) is introduced for the
purpose of avoiding the problem of state label switching $0 \leftrightarrow 1$.
This problem would otherwise arise because of the symmetry of Eqs. (1)-(4)
under the label switching.
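The stationary probabilities and the state-dependent outcome probabilities of Eq. (3) can be sketched numerically as follows. This is only an illustrative Python/NumPy sketch, not the authors' MATLAB code; the function names (`stationary_probs`, `mnl_probs`, `msml_probs`) and the array layout are assumptions made here for illustration.

```python
import numpy as np

def stationary_probs(p01, p10):
    """Stationary probabilities (p0_bar, p1_bar) of the two-state Markov chain."""
    total = p01 + p10
    return p10 / total, p01 / total

def mnl_probs(beta, X):
    """Multinomial logit outcome probabilities for one state.

    beta: (I, K) array, one coefficient row per severity outcome
          (the last row is expected to be all zeros, as in the text).
    X:    (N, K) array of accident characteristics, first column equal to 1.
    Returns an (N, I) array whose rows sum to one.
    """
    u = X @ beta.T                        # linear predictors beta'_i X_n
    u = u - u.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(u)
    return e / e.sum(axis=1, keepdims=True)

def msml_probs(beta0, beta1, X, s_t):
    """Eq. (3): pick the state-0 or state-1 coefficients according to s_t."""
    return mnl_probs(beta0 if s_t == 0 else beta1, X)
```

For example, `stationary_probs(0.1, 0.4)` gives probabilities close to 0.8 and 0.2, which satisfy restriction (2) because the chain leaves state 0 less readily than state 1.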



If accident events are assumed to be independent, the likelihood function is

$$
f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M}) =
\prod_{t=1}^{T}\prod_{n=1}^{N_t}\prod_{i=1}^{I}
\left[P_{t,n}^{(i)}\right]^{\delta_{t,n}^{(i)}}.
\tag{4}
$$

Here, because the state variables $s_t$ are unobservable, the vector of all
estimable parameters $\boldsymbol{\Theta}$ must include all states, in
addition to model parameters ($\beta$-s) and transition probabilities. Thus,
$\boldsymbol{\Theta} = [\boldsymbol{\beta}'_{(0)}, \boldsymbol{\beta}'_{(1)},
p_{0\to1}, p_{1\to0}, \mathbf{S}']'$, where vector
$\mathbf{S} = [s_1, s_2, \ldots, s_T]'$ has length $T$ and contains all state
values. Eqs. (1)-(4) define the two-state Markov switching multinomial logit
(MSML) model considered here.
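Because Eq. (4) is a product of many probabilities, in practice it is evaluated on the logarithmic scale. The sketch below shows one way to do this in Python/NumPy for a given state sequence; the function name `msml_log_likelihood` and the per-period list layout are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def msml_log_likelihood(beta0, beta1, states, X_by_period, y_by_period):
    """Logarithm of Eq. (4) for given coefficients and a given state sequence.

    beta0, beta1: (I, K) coefficient arrays for states 0 and 1.
    states:       length-T sequence of 0/1 state values s_t.
    X_by_period:  list of T arrays, each (N_t, K), first column equal to 1.
    y_by_period:  list of T integer arrays of length N_t holding the observed
                  outcome index (0..I-1), i.e. where the dummy equals one.
    """
    total = 0.0
    for s_t, X, y in zip(states, X_by_period, y_by_period):
        beta = beta0 if s_t == 0 else beta1
        u = X @ beta.T
        # log of the multinomial logit probabilities (log-softmax)
        log_p = u - np.logaddexp.reduce(u, axis=1, keepdims=True)
        # only the observed outcome of each accident contributes
        total += log_p[np.arange(len(y)), y].sum()
    return total
```

Only the term with $\delta_{t,n}^{(i)} = 1$ survives in the product over $i$, which is why the code indexes the log-probability of the observed outcome directly.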



3 Model estimation methods 



Statistical estimation of Markov switching models is complicated by
unobservability of the state variables $s_t$.^5 As a result, the traditional
maximum likelihood estimation (MLE) procedure is of very limited use for
Markov switching models. Instead, a Bayesian inference approach is used. Given
a model $\mathcal{M}$ with likelihood function
$f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M})$, the Bayes formula is

$$
f(\boldsymbol{\Theta}|\mathbf{Y},\mathcal{M}) =
\frac{f(\mathbf{Y},\boldsymbol{\Theta}|\mathcal{M})}{f(\mathbf{Y}|\mathcal{M})} =
\frac{f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M})\,\pi(\boldsymbol{\Theta}|\mathcal{M})}
     {\int f(\mathbf{Y},\boldsymbol{\Theta}|\mathcal{M})\,d\boldsymbol{\Theta}}.
\tag{5}
$$

Here $f(\boldsymbol{\Theta}|\mathbf{Y},\mathcal{M})$ is the posterior
probability distribution of model parameters $\boldsymbol{\Theta}$ conditional
on the observed data $\mathbf{Y}$ and model $\mathcal{M}$. Function
$f(\mathbf{Y},\boldsymbol{\Theta}|\mathcal{M})$ is the joint probability
distribution of $\mathbf{Y}$ and $\boldsymbol{\Theta}$ given model
$\mathcal{M}$. Function $f(\mathbf{Y}|\mathcal{M})$ is the marginal likelihood
function, that is, the probability distribution of data $\mathbf{Y}$ given
model $\mathcal{M}$. Function $\pi(\boldsymbol{\Theta}|\mathcal{M})$ is the
prior probability distribution of parameters that reflects prior knowledge
about $\boldsymbol{\Theta}$. The intuition behind Eq. (5) is straightforward:
given model $\mathcal{M}$, the posterior distribution accounts for both the
observations $\mathbf{Y}$ and our prior knowledge of $\boldsymbol{\Theta}$.

In our study (and in most practical studies), the direct application of
Eq. (5) is not feasible because the parameter vector $\boldsymbol{\Theta}$
contains too many components, making integration over $\boldsymbol{\Theta}$ in
Eq. (5) extremely difficult. However, the posterior distribution
$f(\boldsymbol{\Theta}|\mathbf{Y},\mathcal{M})$ in Eq. (5) is known up to its
normalization constant,
$f(\boldsymbol{\Theta}|\mathbf{Y},\mathcal{M}) \propto
f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M})\,\pi(\boldsymbol{\Theta}|\mathcal{M})$.
As a result, we use Markov Chain Monte Carlo (MCMC) simulations, which provide
a convenient and practical computational methodology for sampling from a
probability distribution known up to a constant (the posterior distribution in
our case). Given a large enough posterior sample of parameter vector
$\boldsymbol{\Theta}$, any posterior expectation and variance can be found and
Bayesian inference can be readily applied. A reader interested in details is
referred to our Paper I or to Malyshkina (2008), where we describe our choice
of the prior distribution $\pi(\boldsymbol{\Theta}|\mathcal{M})$ and the MCMC
simulation algorithm.^6 Although in this study we estimate a two-state Markov
switching multinomial logit model for accident severity outcomes and in
Paper I we estimated a two-state Markov switching negative binomial model for
accident frequencies, this difference is not essential for the Bayesian-MCMC
model estimation methods. In fact, the main difference is in the likelihood
function (multinomial logit as opposed to negative binomial). We therefore
used the same numerical MCMC code, which we wrote in the MATLAB programming
language, for model estimation in both studies. We tested our code on
artificial data sets of accident severity outcomes. The test procedure
consisted of generating artificial data from a known model and then using
these data to estimate the underlying model by means of our simulation code.
With this procedure we found that the MSML models used to generate the
artificial data were reproduced successfully by our estimation code.

^4 This can be done because $\mathbf{X}_{t,n}$ are assumed to be independent
of the outcome $i$.

^5 Below we will have 208 time periods ($T = 208$). In this case, there are
$2^{208}$ possible combinations for the value of vector
$\mathbf{S} = [s_1, s_2, \ldots, s_T]'$.
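The artificial-data test described above starts from data generated by a known MSML model. A minimal sketch of such a generator is given below in Python/NumPy; it is not the authors' MATLAB code, and the function names, the seeding, and the choice to start the chain from its stationary distribution are assumptions made for illustration.

```python
import numpy as np

def simulate_states(T, p01, p10, rng):
    """Draw s_1..s_T from the two-state Markov chain of Eq. (1)."""
    p1_bar = p01 / (p01 + p10)          # stationary probability of state 1
    s = np.empty(T, dtype=int)
    s[0] = rng.random() < p1_bar        # assumed: start from the stationary law
    for t in range(1, T):
        # probability of leaving the current state
        jump = p01 if s[t - 1] == 0 else p10
        s[t] = 1 - s[t - 1] if rng.random() < jump else s[t - 1]
    return s

def simulate_outcomes(beta, X, rng):
    """Draw one severity outcome index (0..I-1) per accident from a
    multinomial logit with coefficients beta (I, K) and covariates X (N, K)."""
    u = X @ beta.T
    p = np.exp(u - u.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    cdf = p.cumsum(axis=1)
    r = rng.random((len(X), 1))
    # inverse-CDF sampling; clip guards against rounding in the last CDF entry
    return (r > cdf).sum(axis=1).clip(max=p.shape[1] - 1)
```

A full test would draw the states with `simulate_states`, draw outcomes period by period with the state-specific coefficients, run the estimation code on the result, and check that the known coefficients and transition probabilities fall inside the estimated credible intervals.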

For comparison of different models we use a formal Bayesian approach. Let
there be two models $\mathcal{M}_1$ and $\mathcal{M}_2$ with parameter vectors
$\boldsymbol{\Theta}_1$ and $\boldsymbol{\Theta}_2$ respectively. Assuming
that we have equal preferences of these models, their prior probabilities are
$\pi(\mathcal{M}_1) = \pi(\mathcal{M}_2) = 1/2$. In this case, the ratio of
the models' posterior probabilities, $P(\mathcal{M}_1|\mathbf{Y})$ and
$P(\mathcal{M}_2|\mathbf{Y})$, is equal to the Bayes factor. The latter is
defined as the ratio of the models' marginal likelihoods (see Kass and
Raftery, 1995). Thus, we have

$$
\frac{P(\mathcal{M}_2|\mathbf{Y})}{P(\mathcal{M}_1|\mathbf{Y})} =
\frac{f(\mathcal{M}_2,\mathbf{Y})/f(\mathbf{Y})}{f(\mathcal{M}_1,\mathbf{Y})/f(\mathbf{Y})} =
\frac{f(\mathbf{Y}|\mathcal{M}_2)\,\pi(\mathcal{M}_2)}{f(\mathbf{Y}|\mathcal{M}_1)\,\pi(\mathcal{M}_1)} =
\frac{f(\mathbf{Y}|\mathcal{M}_2)}{f(\mathbf{Y}|\mathcal{M}_1)},
\tag{6}
$$

where $f(\mathcal{M}_1,\mathbf{Y})$ and $f(\mathcal{M}_2,\mathbf{Y})$ are the
joint distributions of the models and the data, and $f(\mathbf{Y})$ is the
unconditional distribution of the data. As in Paper I, to calculate the
marginal likelihoods $f(\mathbf{Y}|\mathcal{M}_1)$ and
$f(\mathbf{Y}|\mathcal{M}_2)$, we use the harmonic mean formula
$f(\mathbf{Y}|\mathcal{M})^{-1} =
E\left[f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M})^{-1}\,|\,\mathbf{Y}\right]$,
where $E(\ldots|\mathbf{Y})$ means posterior expectation calculated by using
the posterior distribution. If the ratio in Eq. (6) is larger than one, then
model $\mathcal{M}_2$ is favored; if the ratio is less than one, then model
$\mathcal{M}_1$ is favored. An advantage of the use of Bayes factors is that
it has an inherent penalty for including too many parameters in the model and
guards against overfitting.

^6 Our priors for $\beta$-s, $p_{0\to1}$ and $p_{1\to0}$ are flat or nearly
flat, while the prior for the states $\mathbf{S}$ reflects the Markov process
property, specified by Eq. (1).
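The harmonic mean formula and the Bayes factor of Eq. (6) can be evaluated from posterior log-likelihood draws as sketched below. This is an illustrative Python/NumPy sketch under the assumption that each MCMC draw stores its log-likelihood value; the function names are hypothetical. The computation is done on the log scale because the likelihoods themselves (with log-likelihoods in the thousands, negative) would underflow.

```python
import numpy as np

def log_marginal_likelihood(log_lik_draws):
    """Harmonic mean estimate of ln f(Y|M) from posterior draws of the
    log-likelihood LL_k = ln f(Y|Theta_k, M):

        ln f(Y|M) = -ln( (1/K) * sum_k exp(-LL_k) ),

    evaluated with a log-sum-exp to avoid overflow.
    """
    ll = np.asarray(log_lik_draws, dtype=float)
    return -(np.logaddexp.reduce(-ll) - np.log(ll.size))

def log_bayes_factor(log_lik_draws_m2, log_lik_draws_m1):
    """Logarithm of the ratio in Eq. (6); positive values favor model 2."""
    return (log_marginal_likelihood(log_lik_draws_m2)
            - log_marginal_likelihood(log_lik_draws_m1))
```

A positive `log_bayes_factor` corresponds to a ratio larger than one in Eq. (6), that is, to model $\mathcal{M}_2$ being favored.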



To evaluate the performance of model $\{\mathcal{M}, \boldsymbol{\Theta}\}$ in
fitting the observed data $\mathbf{Y}$, we carry out Pearson's $\chi^2$
goodness-of-fit test (Maher and Summersgill, 1996; Cowan, 1998; Wood, 2002;
Press et al., 2007). We perform this test by Monte Carlo simulations to find
the distribution of the Pearson's $\chi^2$ quantity, which measures the
discrepancy between the observations and the model predictions (Cowan, 1998).
This distribution is then used to find the goodness-of-fit p-value, which is
the probability that $\chi^2$ exceeds the observed value of $\chi^2$ under the
hypothesis that the model is true (the observed value of $\chi^2$ is
calculated by using the observed data $\mathbf{Y}$). For additional details,
please see Malyshkina (2008).
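The idea of a Monte Carlo p-value for Pearson's $\chi^2$ can be illustrated with the simplified sketch below (Python/NumPy). The text does not specify how observations are grouped when $\chi^2$ is computed, so this sketch assumes, purely for illustration, that counts are aggregated over all accidents by severity outcome; the function names and the default number of simulations are likewise assumptions, not the authors' procedure.

```python
import numpy as np

def pearson_chi2(outcomes, probs):
    """Pearson's chi-square between observed outcome counts and the counts
    expected under the model probabilities probs (N, I)."""
    observed = np.bincount(outcomes, minlength=probs.shape[1])
    expected = probs.sum(axis=0)
    return ((observed - expected) ** 2 / expected).sum()

def monte_carlo_p_value(outcomes, probs, n_sim=1000, seed=0):
    """Fraction of data sets simulated from the model whose chi-square is at
    least as large as the chi-square of the observed outcomes."""
    rng = np.random.default_rng(seed)
    chi2_obs = pearson_chi2(outcomes, probs)
    cdf = probs.cumsum(axis=1)
    exceed = 0
    for _ in range(n_sim):
        r = rng.random((len(probs), 1))
        sim = (r > cdf).sum(axis=1).clip(max=probs.shape[1] - 1)
        exceed += int(pearson_chi2(sim, probs) >= chi2_obs)
    return exceed / n_sim
```

A p-value that is not small indicates that the observed discrepancy is typical of data generated by the model itself, so the model is not rejected.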



4 Empirical results 



The severity outcome of an accident is determined by the injury level sustained
by the most injured individual (if any) involved in the accident. In this study
we consider three accident severity outcomes: "fatality", "injury" and "PDO
(property damage only)", which we number as $i = 1, 2, 3$ respectively
($I = 3$). We use data from 811720 accidents that were observed in Indiana in
2003-2006. As in Paper I, we use weekly time periods,
$t = 1, 2, 3, \ldots, T = 208$ in total.^7 Thus, the state $s_t$ can change
every week. To increase the predictive power of our models, we consider
accidents separately for each combination of accident type (1-vehicle and
2-vehicle) and roadway class (interstate highways, US routes, state routes,
county roads, streets). We do not consider accidents with more than two
vehicles involved.^8 Thus, in total, there are ten roadway-class-accident-type
combinations that we consider. For each roadway-class-accident-type
combination the following three types of accident severity models are
estimated:

First, we estimate a standard multinomial logit (ML) model without Markov
switching by maximum likelihood estimation (MLE).^9 We refer to this
model as "ML-by-MLE".

Second, we estimate the same standard multinomial logit model by the
Bayesian inference approach and the MCMC simulations. We refer to this
model as "ML-by-MCMC". As one expects, the estimated ML-by-MCMC
model turned out to be very similar to the corresponding ML-by-MLE model
(estimated for the same roadway-class-accident-type combination).

Third, we estimate a two-state Markov switching multinomial logit (MSML)
model by the Bayesian-MCMC methods. In order to make comparison of
explanatory variable effects in different models straightforward, in the MSML
model we use only those explanatory variables that enter the corresponding
standard ML model.^10 To obtain the final MSML model reported here, we
also consecutively construct and use 60%, 85% and 95% Bayesian credible
intervals for evaluation of the statistical significance of each
$\beta$-parameter. As a result, in the final model some components of
$\boldsymbol{\beta}_{(0)}$ and $\boldsymbol{\beta}_{(1)}$ are restricted to
zero or restricted to be the same in the two states.^11 We refer
to this final model as "MSML".

^7 A week is from Sunday to Saturday; there are 208 full weeks in the
2003-2006 time interval.

^8 Among 811720 accidents, 241011 (29.7%) are 1-vehicle, 525035 (64.7%) are
2-vehicle, and only 45674 (5.6%) are accidents with more than two vehicles
involved.
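The credible-interval screening of $\beta$-parameters can be sketched as follows in Python/NumPy, assuming an equal-tailed interval computed from posterior draws of a single parameter (with the posterior probabilities below and above the interval both equal to $\alpha/2$). The function names are hypothetical and the sketch is not the authors' code.

```python
import numpy as np

def credible_interval(draws, alpha):
    """Equal-tailed (1 - alpha) credible interval from posterior draws."""
    lo, hi = np.quantile(draws, [alpha / 2.0, 1.0 - alpha / 2.0])
    return lo, hi

def is_significant(draws, alpha):
    """True if zero lies outside the (1 - alpha) credible interval."""
    lo, hi = credible_interval(draws, alpha)
    return not (lo <= 0.0 <= hi)
```

To test whether a parameter differs between the two states, the same check could be applied to the draws of the difference of its values in states 0 and 1; successive calls with `alpha` equal to 0.40, 0.15 and 0.05 correspond to the 60%, 85% and 95% intervals.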



Note that the two states, and thus the MSML models, do not have to exist
for every roadway-class-accident-type combination. For example, they will not
exist if all estimated model parameters turn out to be statistically the same
in the two states, $\boldsymbol{\beta}_{(0)} = \boldsymbol{\beta}_{(1)}$
(which suggests the two states are identical and the MSML models reduce to the
corresponding standard ML models). Also, the two states will not exist if all
estimated state variables $s_t$ turn out to be close to zero, resulting in
$p_{0\to1} \ll p_{1\to0}$ [compare to Eq. (2)]; in that case the less
frequent state $s_t = 1$ is not realized and the process stays in state
$s_t = 0$.

^9 To obtain parsimonious standard models, estimated by MLE, we choose the
explanatory variables and their dummies by using the Akaike Information
Criterion (AIC) and the 5% statistical significance level for the two-tailed
t-test. Minimization of $AIC = 2K - 2LL$, where $K$ is the number of free
continuous model parameters and $LL$ is the log-likelihood, ensures an optimal
choice of explanatory variables in a model and avoids overfitting (Tsay, 2002;
Washington et al., 2003). For details on variable selection, see
Malyshkina (2006).

^10 A formal Bayesian approach to model variable selection is based on
evaluation of the model's marginal likelihood and the Bayes factor (6).
Unfortunately, because MCMC simulations are computationally expensive,
evaluation of marginal likelihoods for a large number of trial models is not
feasible in our study.

^11 A $\beta$-parameter is restricted to zero if it is statistically
insignificant. A $\beta$-parameter is restricted to be the same in the two
states if the difference of its values in the two states is statistically
insignificant. A $(1 - \alpha)$ credible interval is chosen in such a way that
the posterior probabilities of being below and above it are both equal to
$\alpha/2$ (we use significance levels $\alpha$ = 40%, 15%, 5%).



Turning to the estimation results, the findings show that two states of roadway
safety and the appropriate MSML models exist for severity outcomes of
1-vehicle accidents occurring on all roadway classes (interstate highways, US
routes, state routes, county roads, streets), and for severity outcomes of
2-vehicle accidents occurring on streets. We did not find two states in the
cases of 2-vehicle accidents on interstate highways, US routes, state routes
and county roads (in these cases all estimated state variables $s_t$ were
found to be close to zero). The model estimation results for severity outcomes
of 1-vehicle accidents occurring on interstate highways, US routes and state
routes are given in Tables 1-3. All continuous model parameters ($\beta$-s,
$p_{0\to1}$ and $p_{1\to0}$) are given together with their 95% confidence
intervals (if MLE) or 95% credible intervals (if Bayesian-MCMC); refer to the
superscript and subscript numbers adjacent to parameter estimates in
Tables 1-3.^12 Table 4 gives summary statistics of all roadway accident
characteristic variables $\mathbf{X}_{t,n}$ (except the intercept).

^12 Note that MLE assumes asymptotic normality of the estimates, resulting in
confidence intervals being symmetric around the means (a 95% confidence
interval is ±1.96 standard deviations around the mean). In contrast, Bayesian
estimation does not require this assumption, and posterior distributions of
parameters and Bayesian credible intervals are usually non-symmetric.

Table 1



Estimation results for multinomial logit models of severity outcomes of one-vehicle accidents on Indiana interstate highways 
(the superscript and subscript numbers to the right of individual parameter estimates are 95% confidence/credible intervals) 



Variable 


ML-by-MLE^a


ML-by-MCMC^b


MSML^c


state s = 0


state s = 1


fatality


injury


fatality


injury


fatality


injury


fatality


injury


Intercept (constant term) 


"-13.7 


—3 69-? 5? 

u.un_3 g4 


-12 4-l°i 

-14.5 


-3 72~?-?5 

—3.88 


-12 2-l°A 
-14.4 


—3 98~? TE 


-12 2-l°A 
-14.4 


—3 22~?'?? 
-3.45 


Summer season (dummy) 


9or;.329 
.ZOO 


.ZOO j^42 


907.329 
•^■^'.143 


237-329 
■^■^'.143 


176-293 
•1"^.0551 


•1'°.0561 


176-293 
■1 "J. 0551 


.DIO 282 


Thursday (dummy) 


•'^°-1.48 




- 853"-2"« 

■"'-''■'-1.59 




— 872^'225 
■"'^-1.61 




— 872^'225 
■"'^-1.61 




Construction at the accident location (dummy) 


- 418"-2^-^ 

•^-^"-.623 


•^-^"-.623 


- 425--224 

■^^•-'-.632 


- 425"'224 

■^^•-'-.632 


■•^""-.822 


- 566- '^i^ 

■•^""-.822 


- 566- '^i'^ 

■•^'^"-.822 




Daylight or street lights are lit up if dark (dummy) 




107.224 
.0501 


.00 ( __74o 


143-230 
•-^.0568 


- 378^'So^'* 

.<j( o_ 729 


1 -^0.226 
•^'-'".0522 


- 378" 

.oj o_ 729 


10Q.226 
■^'-"'.0522 


Precipitation: rain/freezing rain/snow/sleet/hail (dummy) 


1 QQ — .830 
-l.ci»_i 92 


.ODl_ 4gy 


l--^-'--1.99 


— ^fi'^~-2®''' 
. 000 _ 460 


-1 ■54-1-03 


. 000 _ 729 


-1 ■54-1-03 

1.041_2 iQ 




Roadway surface is covered by snow/slush (dummy) 




— 4'^2~'2^'' 

•^■^^-.583 


— 1 40^.328 
-^•^■^-2.84 


■^■^"-.590 


-.05151:^7! 


-•0515=:6?1 


-.05151:3?! 


-.05151:36} 


Roadway median is drivable (dummy)


■571:li 




1:77.939 
•0' '.223 




.566:130 




.566:130 




Roadway is at curve (dummy) 


•ii4:2Ji5 


1 1 4-212 
■-^-^^.0165 


116-213 
•11". 0186 


116-213 
•-11'^. 0186 










Primary cause of the accident is driver-related (dummy) 


4.24|:f0 


1-'J'-'1.43 


4 'iq5.64 
^••^^3.39 


i-54l:!| 


4-48i:Ii 


2.oo2:i» 


4.485:11 


■7i5:!Ji 


Help arrived in 20 minutes or less after the crash (dummy) 


■ '^'-'.693 


7Qn.887 


7Qn.891 
■ '^'^.691 


•790:iJ 


•785:it 


•785:it 


■785:11 


70 c. 886 
■ '°''.684 


The vehicle at fault is a motorcycle (dummy) 


r> ss4.59 


^- '^2.36 


3-87^:?3 


2.753:15 


^■"-^3.74 


9q3.83 
O.ZO2 70 




1 •^q2.49 
^■•^^.326 


Age of the vehicle at fault (in years) 


.0285:0370 


n9ac;.0370 

.UZCSO y2„i 


.0286:113™ 


.0286:S]3™ 




■0286:J|3,yj 




■0286:J|'^,^i 


Number of occupants in the vehicle at fault 


.ODD 


■^^■^.0859 


.367:«t 


123-159 
•l^'-'.0861 


■366:i| 


1 24-161 
•1^^.0874 


.366:«| 


1 24-161 
■1^^.0874 


Roadway traveled by the vehicle at fault is multi-lane and 
divided two-way (dummy) 


Z.DUj^ 20 




2-86t:ii 




2.86f:|6 




2.864:66 




At least one of the vehicles involved was on fire (dummy) 


1.24212 




1 1S2.02 


_ 045-.O335 

.0'iO_ ggg 


1.662^56 


009- 0198 

.OOZ_ ggg 




009-. 0198 

.OOZ_ ggg 


Gender of the driver at fault (dummy) 




000.410 

.32» 246 




331 -"113 
.001 248 




224-338 
•^^^.107 




47Q.637 
■^' ^.328 



Table 1 



(Continued) 



Variable 


ML-by-MLE^a


ML-by-MCMC^b


MSML^c


state s = 0


state s = 1 


fatality 


injury 


fatality 


injury 


fatality 


injury 


fatality 


injury 


Probability of severity outcome [$P_{t,n}^{(i)}$ given by Eq. (3)] averaged
over all values of explanatory variables $\mathbf{X}_{t,n}$






.00724 


.176 


.00733 


.174 


.00672 


.192 


Markov transition probability of jump $0 \to 1$ ($p_{0\to1}$)






1 CI. 254 
•-^•-'-^.0704 


Markov transition probability of jump $1 \to 0$ ($p_{1\to0}$)






■330:fg 


Unconditional probabilities of states 0 and 1 ($\bar{p}_0$ and $\bar{p}_1$)






.683:114 and .317:4g] 


Total number of free model parameters ($\beta$-s)


25 


25 


28 


Posterior average of the log-likelihood (LL) 




-8486.7811^^°:^ 


-8396.78lg79-2i 


Max(LL): estimated max. log-likelihood (LL) for MLE; 
maximum observed value of LL for Bayesian-MCMC 


-8465.79 (MLE) 


-8476.37 (observed) 


-8358.97 (observed) 


Logarithm of marginal likelihood of data ($\ln[f(\mathbf{Y}|\mathcal{M})]$)




-8498 46-8494-22 


07—8424.77 
"^■J'-^' -8440.02 


Goodness-of-fit p-value 




0.255 


0.222 


Maximum of the potential scale reduction factors (PSRF) 




1.00302 


1.00060 


Multivariate potential scale reduction factor (MPSRF) 




1.00325 


1.00067 


Number of available observations 


accidents = fatalities + injuries + PDOs: 19094 = 143 + 3369 + 15582



^a Standard (conventional) multinomial logit (ML) model estimated by maximum likelihood estimation (MLE).
^b Standard multinomial logit (ML) model estimated by Markov Chain Monte Carlo (MCMC) simulations.
^c Two-state Markov switching multinomial logit (MSML) model estimated by Markov Chain Monte Carlo (MCMC) simulations.
^d PSRF/MPSRF are calculated separately/jointly for all continuous model parameters. PSRF and MPSRF are close to 1 for converged MCMC chains.



Table 2 



Estimation results for multinomial logit models of severity outcomes of one-vehicle accidents on Indiana US routes

(the superscript and subscript numbers to the right of individual parameter estimates are 95% confidence/credible intervals) 



Variable 


ML-by-MLE^a


ML-by-MCMC^b


MSML^c


state s = 0


state s = 1 


fatality 


injury 


fatality 


injury 


fatality 


injury 


fatality 


injury 


Intercept (constant term) 


-6 51-5.00 
"■^J^_8.03 




-6.62l|;« 


-2-i2lJ;I? 


4.69 
'^—6.92 


z.uo_2 40 


p. 79-4.69 
"• ' ^—6.92 


— 9 7Q — 2.37 
^. — 3.23 


Summer season (dummy) 


•5i4:fil 


.ZUU Qg4y 


■509:fit 


•2oo:ggii 


.J.yU g7gg 


.J.yU g7gg 


iQfi.aoo 

.±yU Q7gg 




Daylight or street lights are lit up if dark (dummy) 




1Q4.287 


-■4921:^^^ 


.203;??6 


-■4931:^^? 


107.290 

.105 




107.290 
.105 


Snowing weather (dummy) 










-i-io:2'27 


■-'-"'-'.0115 


-1.1012^^^ 


1(5^.317 
■-'-"'^.0115 


No roadway junction at the accident location (dummy) 


■'"^.149 


01 '7.335 
■^-'-'.0994 


7971.31 
■ .199 


910. 331 
■^-^■^.0968 


•787i2i 


91 /1 .332 
■^-'-^.0965 


■787i2i 


9-14.332 


Roadway is straight (dummy) 


— 741 

•'^^-1.10 




•'■^^-1.09 


-296I:«^ 


ly Q'7~ ■ 372 

~ ' -1.09 


•^^^-.398 




_ 994 --189 
•^^^-.398 


Primary cause of the accident is environment-related (dummy) 


•J-^a_4.l8 




_r! K1 -2.81 
•-'■^^-4.32 


-i.89_2;oo 


_o rq-2.89 
o.dy_4 40 


-2mzlil 


_o ^q-2.89 
o.d»_4 40 




Help arrived in 10 minutes or less after the crash (dummy) 




■594:igi 


.562:650 




.560:1^1 


.560:1^1 


.560:t« 


.560:«^8 


The vehicle at fault is a motorcycle (dummy) 




2(13.55 


2-57?:l 


091 3.56 
•^■^^2.87 


993.58 
•^•^^2.88 


Q 993.58 
•^•^^2.88 


Q 998.68 

'^■''^2.88 


993.58 
'^■^^2.88 


Age of the vehicle at fault (in years) 


.0363:«i| 


r)^po.0444 


(-,007.0448 


(-|Of;7.0448 




r)Of;o.0447 
.UJDD 0285 




r)Of.o.0447 
.UODD 0285 


Speed limit (used if known and the same for all vehicles involved) 


n'?K'i.0631 

.UOOO QQggQ 


Qi 01 .0178 


no7o.0643 
•"'-"'-'.0117 


m 1 O.0176 

.UllO gggjg 


(i9aci.0495 
.UZOO Q^Q4 


(11(10.0178 




(1190.0178 
.uizu gog35 


Roadway traveled by the vehicle at fault is two-lane and 

one-way (dummy)






_ 9.j.,.n,'',i7 
■--■'-,:«)« 


^-.3y8 


_ ,.iir,ij4 


_ ,.iir,ij4 


^-.401 


-.221-^14 


At least one of the vehicles involved was on fire (dummy) 


1 iqi-94 
-^•■^^.439 




^•^■^.315 




1 971.98 

-^■^'.452 




l-^' .452 




Age of the driver at fault (in years) 


Q114.O213 




(1110.0211 
■"^-^•^.00137 




(11Q1.O2OO 
■"^"■^.0000542 








Weekday (Monday through Friday) (dummy) 




■-^^-^-.196 




_ 104.0124 

■-^"^-.196 




-| 9c:. 0242 
.1Z0_ 227 






Gender of the driver at fault (dummy) 




979. 3G2 
■^'^.183 




•^'".186 




•280;i« 




■280;?6« 



Table 2 



(Continued) 



Variable 


ML-by-MLE^a


ML-by-MCMC^b


MSML^c


state s = 0


state s = 1 


fatality 


injury 


fatality 


injury 


fatality 


injury 


fatality 


injury 


Probability of severity outcome [$P_{t,n}^{(i)}$ given by Eq. (3)] averaged
over all values of explanatory variables $\mathbf{X}_{t,n}$






.00747 


.179 


.00823 


.183 


.00218 


.158 


Markov transition probability of jump $0 \to 1$ ($p_{0\to1}$)






.0767:i57g 


Markov transition probability of jump $1 \to 0$ ($p_{1\to0}$)






■6i3:ir7 


Unconditional probabilities of states 0 and 1 ($\bar{p}_0$ and $\bar{p}_1$)






■887:??9 and .113:g0g 


Total number of free model parameters ($\beta$-s)


24 


24 


25 


Posterior average of the log-likelihood (LL) 




-7406 39-7400.61 
'^'-'"■•^='-7414.03 


— 7349 nfj- 7335.46 
'■^^"•""-7364.47 


Max(LL): estimated max. log-likelihood (LL) for MLE; 
maximum observed value of LL for Bayesian-MCMC


-7384.05 (MLE) 


-7396.37 (observed) 


-7318.21 (observed) 


Logarithm of marginal likelihood of data ($\ln[f(\mathbf{Y}|\mathcal{M})]$)




— 741 7 QS-7413.72 


-7377 49-7369.62 


Goodness-of-fit p-value 




0.337 


0.255 


Maximum of the potential scale reduction factors (PSRF) 




1.00319 


1.00073 


Multivariate potential scale reduction factor (MPSRF) 




1.00376 


1.00085 


Number of available observations 


accidents = fatalities + injuries + PDOs: 17797 = 138 + 3184 + 14485



^a Standard (conventional) multinomial logit (ML) model estimated by maximum likelihood estimation (MLE).
^b Standard multinomial logit (ML) model estimated by Markov Chain Monte Carlo (MCMC) simulations.
^c Two-state Markov switching multinomial logit (MSML) model estimated by Markov Chain Monte Carlo (MCMC) simulations.
^d PSRF/MPSRF are calculated separately/jointly for all continuous model parameters. PSRF and MPSRF are close to 1 for converged MCMC chains.



Table 3



Estimation results for multinomial logit models of severity outcomes of one-vehicle accidents on Indiana state routes

(the superscript and subscript numbers to the right of individual parameter estimates are 95% confidence/credible intervals) 



Variable 


ML-by-MLE^a


ML-by-MCMC^b


MSML^c


state s = 0


state s = 1 


fatality 


injury 


fatality 


injury 


fatality 


injury 


fatality 


injury 


Intercept (constant term)


-3.98:i«[; 




—4 rm~^-^^ 




-3.4411^° 


-i.68:i:|t 




-i-68:li^ 


Summer season (dummy) 


OQQ.307 


■232:??^ 


009.307 
•^■^^.157 


.232i°l 


OQQ.314 
.zoo jgg 


000.314 

.zoo j^gg 


oqQ.314 
.zoo igg 


OOQ.314 
.zoo ig3 


Roadway type (dummy: 1 if urban, if rural) 




••^^^-.478 


■■J»'J_.483 


_ oQt.-.306 
.o»o_ 4g3 




one:-. 296 
^••JOO_.474 


— 2.05_3^g2 


q oc:-.296 
— J.St>_ 4^4 


Daylight or street lights are lit up if dark (dummy) 




■i93:?i 


-6411:^?;^ 


inn. 267 
•-^''^.132 


-.689:-;^« 




-68911^? 


977.378 
.Z( ( 4-,-, 


Precipitation: rain/freezing rain/snow/sleet/hail (dummy) 


— S^4'4*'^ 




-.868I,«^, 




-82914^^ 




-829l4^« 




Roadway median is drivable (dummy) 


— .583_;94Q 




-■596I:9g4 




— .589_:ggJ 




-.589_:ggj!, 




Roadway is straight (dummy) 


-2841:1^^ 


-•2841:1^:1 


-2831:^^^ 


„oq-.214 
-■^»3_.362 


1 17-. 0184 
•-^■^'-.214 


1 17-. 0184 
-.214 


117-. 0184 
■^^'-.214 


-4651:*?° 


Primary cause of the accident is environment-related (dummy) 




-1.83-:™ 


-4-28lt«^ 


-i.84:li? 


-4.4oii:Ig 


-2.3o:^:l^ 




-i.4i:i:^^ 


Help arrived in 20 minutes or less after the crash (dummy) 


.840;«i^ 


.840:«i5 


.863:?« 


■863:?t! 




.86i:?44 


1 642-64 


•86i:?|| 


The vehicle at fault is a motorcycle (dummy) 


o 103.31 


o 1 n3.31 
■'•^'-'2.89 


in3.31 
•^■^"2.89 


3 in3.3l 
■^•^"2.89 


q 073.66 
'-*-'^'3.09 


q 073.66 
3.09 


073.66 
•^■■^'3.09 


2-82i:i? 


Number of occupants in the vehicle at fault 




.UOO( 0265 


.UODO Q276 


ncf;r;.0858 
.UODO 2276 


•0942:Jil8 


•0942:iii8 


.0942:j!,||8 




At least one of the vehicles involved was on fire (dummy) 


i-9o?:|i 


■456 lii 


i.87?:i 


447.768 
.124 


i.87?:i 


•461:11? 


i.87?:i 


.461:11? 


Age of the driver at fault (in years) 


■■■^■07.80 

X 10' 


-2.80:f» 
X 10"-^ 


14.5?ie? 
X 10'^ 


— 2 71 "■'^23 

X 10"'^ 


14.5|^6l 
X 10"* 


-2.46:4«! 

X 10'^ 


14.5?y 
X 10"* 


-2.4614''!! 
X 10'* 


Gender of the driver at fault (dummy) 




070.344 


-.505Zfgl 


070.343 
213 


■^'•'-.764 


0QQ.348 
.zoo 218 


- 473-1^2 

•^'■^-.764 


000.348 
.zoo 218 


Age of the vehicle at fault (in years) 




no^4.0392 




.0335:0277 




nooo.0390 
.uooz 0274 




noQo 0390 
.uooz 0274 


"License state of the vehicle at fault is a U.S. state except Indiana
and its neighboring states (IL, KY, OH, MI)" indicator variable




-■449I:21J 




— 444- -^il 

•^^^-.679 




-.4361:208 




— 436" '208 

.40D_ gyi 



Table 3

(Continued)

Columns: Variable | ML-by-MLE (a): fatality, injury | ML-by-MCMC (b): fatality, injury | MSML (c), state s = 0: fatality, injury | MSML (c), state s = 1: fatality, injury
Probability of severity outcome [P^(i)_{t,n} given by Eq.] averaged
over all values of explanatory variables X_{t,n}






.0089 


.179 


.00951 


.180 


.00804 


.179 


Markov transition probability of jump 0 → 1 (p_{0→1})






qoc-.465 
■'^''■^.216 


Markov transition probability of jump 1 → 0 (p_{1→0})






■450:«» 


Unconditional probabilities of states 0 and 1 (p_0 and p_1)






.574:681 and .426;«6 


Total number of free model parameters (β-s)


22 


22 


28 


Posterior average of the log-likelihood (LL) 




-13867 AOZllTrl fa 


1 QYo-i '7fr — 13765.02 

10(01.(U_i3gQQ gg 


Max(LL): estimated max. log-likelihood (LL) for MLE; 
maximum observed value of LL for Bayesian-MCMC 


-13846.60 (MLE) 


-13858.00 (observed) 


-13745.61 (observed) 


Logarithm of marginal likelihood of data (ln[f(Y|M)])




— 13877. 89_;^3ggQ3g 


-13820.20:««0«:?5 


Goodness-of-fit p-value 




0.515 


0.445 


Maximum of the potential scale reduction factors (PSRF) 




1.00027 


1.00029 


Multivariate potential scale reduction factor (MPSRF) 




1.00041 


1.00045 


Number of available observations 


accidents = fatalities + injuries + PDOs: 33528 = 302 + 6018 + 27208



a Standard (conventional) multinomial logit (ML) model estimated by maximum likelihood estimation (MLE).
b Standard multinomial logit (ML) model estimated by Markov Chain Monte Carlo (MCMC) simulations.
c Two-state Markov switching multinomial logit (MSML) model estimated by Markov Chain Monte Carlo (MCMC) simulations.
d PSRF/MPSRF are calculated separately/jointly for all continuous model parameters. PSRF and MPSRF are close to 1 for converged MCMC chains.

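The convergence diagnostic mentioned in the table notes can be illustrated with a short calculation. The sketch below computes a potential scale reduction factor for a single scalar parameter from several chains, using the basic form sqrt(V/W), where W is the average within-chain variance and V is the pooled variance estimate. This is only one common variant of the diagnostic; the exact formula used for the values in Table 3 is not stated in this section, and the function name `psrf` and the synthetic chains are assumptions made for illustration.

```python
import random
from statistics import mean, variance

def psrf(chains):
    """Potential scale reduction factor for one scalar parameter.

    `chains` is a list of equal-length lists of draws, one list per chain.
    Basic form: sqrt(V / W), with W the mean within-chain variance and
    V = (n - 1) / n * W + B / n the pooled variance estimate.
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    w = mean(variance(c) for c in chains)      # within-chain variance W
    b_over_n = variance(chain_means)           # between-chain variance B / n
    var_plus = (n - 1) / n * w + b_over_n      # pooled variance estimate V
    return (var_plus / w) ** 0.5

# Synthetic example (hypothetical data): four chains sampling the same
# distribution, and two chains stuck around different values.
rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
stuck = [[rng.gauss(mu, 1.0) for _ in range(2000)] for mu in (0.0, 5.0)]
print(psrf(mixed), psrf(stuck))
```

Chains that explore the same distribution give a value close to 1, while chains that disagree give a value well above 1.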


The top, middle and bottom plots in Figure 1 show weekly posterior
probabilities P(s_t = 1|Y) of the less frequent state s_t = 1 for the MSML
models estimated for severity of 1-vehicle accidents occurring on interstate
highways, US routes and state routes, respectively. Because of space
limitations, in this paper we do not report estimation results for severity of
1-vehicle accidents on county roads and streets, and for severity of 2-vehicle
accidents. However, below we discuss our findings for all
roadway-class-accident-type combinations.



For unreported model estimation results see Malyshkina (2008).



We find that in all cases when the two states and Markov switching
multinomial logit (MSML) models exist, these models are strongly favored by the
empirical data over the corresponding standard multinomial logit (ML) models.
Indeed, from lines "marginal LL" in Tables 1-3 we see that the MSML models
improve the logarithm of the marginal likelihood of the data by considerable
amounts, ranging from 40.5 to 61.4, as compared to the corresponding ML models.
Thus, from Eq. we find that, given the accident severity data, the posterior
probabilities of the MSML models are larger than the probabilities of the
corresponding ML models by factors ranging from e^{40.5} to e^{61.4}. In the
cases of 1-vehicle accidents on county roads and streets, and in the case of
2-vehicle accidents on streets, MSML models (not reported here) are also
strongly favored by the empirical data over the corresponding ML models
(Malyshkina, 2008).
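The step from log-marginal-likelihoods to posterior model odds can be made concrete with a small calculation. The sketch below assumes that the two competing models are the only candidates and, by default, that they have equal prior probabilities; both are assumptions made here for illustration, and the function names are hypothetical. The two input values are the log-marginal-likelihoods as legible in Table 3 (state routes).

```python
import math

def log_posterior_odds(log_ml_a, log_ml_b, log_prior_odds=0.0):
    """Log of P(model A | Y) / P(model B | Y), by Bayes' formula."""
    return (log_ml_a - log_ml_b) + log_prior_odds

def posterior_prob_a(log_ml_a, log_ml_b, log_prior_odds=0.0):
    """P(model A | Y) when A and B are the only candidate models."""
    log_odds = log_posterior_odds(log_ml_a, log_ml_b, log_prior_odds)
    # Numerically stable logistic function of the log odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)

# Log-marginal-likelihoods for 1-vehicle accidents on state routes.
log_ml_msml = -13820.20
log_ml_ml = -13877.89
improvement = log_posterior_odds(log_ml_msml, log_ml_ml)
print(round(improvement, 2))  # 57.69
```

The improvement of 57.69 lies inside the reported range of 40.5 to 61.4, and the corresponding posterior odds in favor of the MSML model are e^{57.69} under equal priors.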



Let us now consider the maximum likelihood estimation (MLE) of the standard
ML models and an imaginary MLE of the MSML models. We find that, in this
imaginary case, a classical statistics approach for model comparison, based on
the MLE, would also favor the MSML models over the standard ML models. For
example, refer to line "max(LL)" in Table 1, given for the case of 1-vehicle
accidents on interstate highways. The MLE gave the maximum log-likelihood
value -8465.79 for the standard ML model. The maximum log-likelihood value
observed during our MCMC simulations for the MSML model is equal to -8358.97.
An imaginary MLE, at its convergence, would give an MSML log-likelihood value
that would be even larger than this observed value. Therefore, if estimated by
the MLE, the MSML model would provide a large improvement, of at least 106.82,
in the maximum log-likelihood value over the corresponding ML model. This
improvement would come with only a modest increase in the number of free
continuous model parameters (β-s) that enter the likelihood function (refer to
Table 1 under "# free par.").



Note that these posterior probabilities are equal to the posterior expectations of
s_t: P(s_t = 1|Y) = 1 × P(s_t = 1|Y) + 0 × P(s_t = 0|Y) = E(s_t|Y).

We use the harmonic mean formula to calculate the values and the 95% confidence
intervals of the log-marginal-likelihoods given in lines "marginal LL" of Tables 1-3.
The confidence intervals are calculated by bootstrap simulations. For details, see
Paper I or Malyshkina (2008).
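The harmonic mean formula and its bootstrap interval can be sketched as follows. The estimator takes the log-likelihood values recorded at the posterior draws and returns minus the logarithm of the posterior average of the inverse likelihood. The bootstrap shown here resamples the draws independently, which ignores any autocorrelation in the MCMC output; the exact bootstrap procedure used for Tables 1-3 is described in Paper I and Malyshkina (2008), not here, so this is a simplified illustration. All function names and the synthetic draws are hypothetical.

```python
import math
import random

def log_mean_exp(values):
    """Logarithm of the arithmetic mean of exp(v), computed stably."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def log_marginal_harmonic(log_liks):
    """Harmonic mean estimate of ln f(Y|M) from posterior log-likelihoods."""
    # 1 / f(Y|M) is estimated by the posterior average of 1 / likelihood.
    return -log_mean_exp([-ll for ll in log_liks])

def bootstrap_interval(log_liks, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the harmonic mean estimate."""
    rng = random.Random(seed)
    n = len(log_liks)
    estimates = sorted(
        log_marginal_harmonic([log_liks[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo = estimates[int((1 - level) / 2 * n_boot)]
    hi = estimates[min(n_boot - 1, int((1 + level) / 2 * n_boot))]
    return lo, hi

# Synthetic log-likelihood draws (hypothetical data, not from the paper).
rng = random.Random(1)
draws = [rng.gauss(-100.0, 1.0) for _ in range(2000)]
est = log_marginal_harmonic(draws)
lo, hi = bootstrap_interval(draws, n_boot=200)
print(est, lo, hi)
```

Working with log-likelihoods through `log_mean_exp` avoids the overflow that would occur if likelihood values of order e^{-13800} were exponentiated directly.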
Fig. 1. Weekly posterior probabilities P(s_t = 1|Y) for the MSML models estimated
for severity of 1-vehicle accidents on interstate highways (top plot), US routes
(middle plot) and state routes (bottom plot).



Similar arguments hold for comparison of MSML and ML models estimated
for other roadway-class-accident-type combinations (see Tables 2 and 3).

To evaluate the goodness-of-fit for a model, we use the posterior (or MLE)
estimates of all continuous model parameters (β-s, α, p_{0→1}, p_{1→0}) and
generate 10^ artificial data sets under the hypothesis that the model is true.
We find the distribution of χ² and calculate the goodness-of-fit p-value for
the observed value of χ². For details, see Malyshkina (2008). The resulting
p-values for our models are given in Tables 1-3. These p-values are around
00-100%. Therefore, all models fit the data well.
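A simulation-based goodness-of-fit test of this kind can be sketched in a few lines. The sketch assumes a Pearson χ² statistic that compares outcome counts with their expected values under the model; the precise statistic used in Malyshkina (2008) is not given in this section, so this choice, the function names and the toy probabilities are all assumptions. For an MSML model the state sequence would additionally have to be simulated from p_{0→1} and p_{1→0} before drawing outcomes; that step is omitted here.

```python
import random

def pearson_chi2(counts, expected):
    """Pearson chi-square statistic for outcome counts."""
    return sum((c - e) ** 2 / e for c, e in zip(counts, expected))

def simulate_counts(probs, rng):
    """Draw one outcome per observation and return counts per outcome."""
    k = len(probs[0])
    counts = [0] * k
    for p in probs:
        counts[rng.choices(range(k), weights=p)[0]] += 1
    return counts

def gof_p_value(observed_counts, probs, n_sim=1000, seed=0):
    """Share of simulated data sets whose chi-square is at least as large
    as the chi-square of the observed data."""
    rng = random.Random(seed)
    k = len(probs[0])
    expected = [sum(p[i] for p in probs) for i in range(k)]
    chi2_obs = pearson_chi2(observed_counts, expected)
    exceed = sum(
        pearson_chi2(simulate_counts(probs, rng), expected) >= chi2_obs
        for _ in range(n_sim)
    )
    return exceed / n_sim

# Toy model (hypothetical): 500 accidents with identical outcome
# probabilities for fatality, injury and PDO.
probs = [[0.01, 0.18, 0.81]] * 500
p_good = gof_p_value([5, 90, 405], probs, n_sim=200)
p_bad = gof_p_value([100, 100, 300], probs, n_sim=200)
print(p_good, p_bad)
```

Observed counts that match the model expectations give a p-value near 1, while counts far from the expectations give a p-value near 0.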
Note that the state values S are generated by using p_{0→1} and p_{1→0}.
Now, refer to Table 5. The first six rows of this table list time-correlation
coefficients between posterior probabilities P(s_t = 1|Y) for the six MSML
models that exist and are estimated for six roadway-class-accident-type
combinations (1-vehicle accidents on interstate highways, US routes, state
routes, county roads, streets, and 2-vehicle accidents on streets). We see that
the states for 1-vehicle accidents on all high-speed roads (interstate
highways, US routes, state routes and county roads) are correlated with each
other. The values of the corresponding correlation coefficients are positive
and range from 0.263 to 0.688 (see Table 5). This result suggests the existence
of common (unobservable) factors that can cause switching between states of
roadway safety for 1-vehicle accidents on all high-speed roads.

The remaining rows of Table 5 show correlation coefficients between posterior
probabilities P(s_t = 1|Y) and weather-condition variables. These correlations
were found by using daily and hourly historical weather data in Indiana,
available at the Indiana State Climate Office at Purdue University
(www.agry.purdue.edu/climate). For these correlations, the precipitation and
snowfall amounts are daily amounts in inches averaged over the week and
across Indiana weather observation stations. The temperature variable is
the mean daily air temperature (°F) averaged over the week and across the
weather stations. The wind gust variable is the maximal instantaneous wind
speed (mph) measured during the 10-minute period just prior to the
observational time. Wind gusts are measured every hour and averaged over the
week and across the weather stations. The effect of fog/frost is captured by a
dummy variable that is equal to one if and only if the difference between air
and dewpoint temperatures does not exceed 5 °F (in this case frost can form
if the dewpoint is below the freezing point 32 °F, and fog can form otherwise).
The fog/frost dummies are calculated for every hour and are averaged over
the week and across the weather stations. Finally, the visibility distance
variable is the harmonic mean of hourly visibility distances, which are
measured in miles every hour and are averaged over the week and across the
weather stations.
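Two of the weather variables defined above are simple enough to write down directly. The sketch below codes the hourly fog/frost dummy from air and dewpoint temperatures, and the harmonic mean of visibility distances with distances below 0.25 miles set to 0.25 miles, as in the footnote on the harmonic mean. The function names and the sample readings are hypothetical, and the averaging across weeks and weather stations is not shown.

```python
def fog_frost_dummy(air_temp_f, dewpoint_f, max_spread_f=5.0):
    """1 if the air temperature exceeds the dewpoint by at most 5 °F, else 0."""
    return 1 if (air_temp_f - dewpoint_f) <= max_spread_f else 0

def harmonic_mean_visibility(distances_miles, floor_miles=0.25):
    """Harmonic mean of visibility distances (miles).

    Distances below `floor_miles` are replaced by `floor_miles`, which also
    prevents division by zero for zero-visibility readings.
    """
    floored = [max(d, floor_miles) for d in distances_miles]
    return len(floored) / sum(1.0 / d for d in floored)

# Hypothetical hourly readings.
print(fog_frost_dummy(40.0, 37.0), fog_frost_dummy(60.0, 40.0))
print(harmonic_mean_visibility([10.0, 2.5]))
```

The harmonic mean gives more weight to short visibility distances than an arithmetic mean would, so a few hours of poor visibility noticeably lower the weekly value.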

From the results given in Table 5 we find that for 1-vehicle accidents on all
high-speed roads (interstate highways, US routes, state routes and county
roads), the less frequent state s_t = 1 is positively correlated with extreme
temperatures (low during winter and high during summer), rain precipitation
and snowfall, strong wind gusts, fog and frost, and low visibility distances.
It is reasonable to expect that roadway safety is different during bad weather
as compared to better weather, resulting in the two-state nature of roadway
safety.

Here and below we calculate weighted correlation coefficients. For variable
P(s_t = 1|Y) = E(s_t|Y) we use weights w_t inversely proportional to the
posterior standard deviations of s_t. That is,
w_t ∝ min{1/std(s_t|Y), median[1/std(s_t|Y)]}.

Snowfall and precipitation amounts are weakly related with each other because
snow density (g/cm^3) can vary by more than a factor of ten.

The harmonic mean d̄ of distances d_n is calculated as
d̄^{-1} = (1/N) Σ_{n=1}^{N} d_n^{-1}, assuming d_n = 0.25 miles if
d_n < 0.25 miles.
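The weighted correlation coefficients described in the footnote on weights can be sketched as follows. Because s_t is binary, its posterior standard deviation is sqrt(P(1 - P)) with P = P(s_t = 1|Y). The sketch applies the same weights w_t to both series; whether the weather series carry weights of their own is not stated here, so that choice is an assumption, as are the function names and the toy numbers.

```python
import math
from statistics import median

def state_weights(post_probs):
    """Weights proportional to min{1/std(s_t|Y), median[1/std(s_t|Y)]}."""
    inv_std = []
    for p in post_probs:
        std = math.sqrt(p * (1.0 - p))   # s_t is binary: Var = p (1 - p)
        inv_std.append(1.0 / std if std > 0 else math.inf)
    cap = median(inv_std)                # caps the weights of near-certain weeks
    return [min(v, cap) for v in inv_std]

def weighted_corr(x, y, w):
    """Weighted Pearson correlation coefficient of x and y."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sw
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / sw
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / sw
    return cov / math.sqrt(vx * vy)

# Toy weekly series (hypothetical values).
post = [0.1, 0.5, 0.9, 0.3, 0.7]
snow = [0.0, 0.4, 1.2, 0.1, 0.8]
w = state_weights(post)
print(weighted_corr(post, snow, w))
```

The cap at the median keeps weeks with posterior probabilities very close to 0 or 1 from dominating the coefficient. The sketch assumes that fewer than half of the weeks have probabilities exactly equal to 0 or 1, since otherwise the median itself would be infinite.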



The results of Table [5] suggest that Markov switching for road safety on streets 
is very different from switching on all other roadway classes. In particular, 
the states of roadway safety on streets exhibit low correlation with states on 
other roads. In addition, only streets exhibit Markov switching in the case 
of 2-vehicle accidents. Finally, states of roadway safety on streets show little 
correlation with weather conditions. A possible explanation of these differences 
is that streets are mostly located in urban areas and they have traffic moving
at speeds lower than those on other roads.



Next, we consider the estimation results for the stationary unconditional
probabilities p_0 and p_1 of states s_t = 0 and s_t = 1 for MSML models (see
Section 2). In the cases of 1-vehicle accidents on interstate highways, US
routes and state routes these probabilities are listed in lines "p_0 and p_1"
of Tables 1-3. In the cases of 1-vehicle accidents on county roads and 1- and
2-vehicle accidents on streets refer to Malyshkina (2008). We find that the
ratio p_1/p_0 is approximately equal to 0.46, 0.13, 0.74, 0.25, 0.65 and 0.36
in the cases of 1-vehicle accidents on interstate highways, US routes, state
routes, county roads, streets, and 2-vehicle accidents on streets,
respectively. Thus for some roadway-class-accident-type combinations (for
example, 1-vehicle accidents on US routes) the less frequent state s_t = 1 is
quite rare, while for other combinations (for example, 1-vehicle accidents on
state routes) state s_t = 1 is only slightly less frequent than state s_t = 0.
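The link between the transition probabilities and the stationary probabilities of a two-state Markov chain can be checked numerically. For such a chain p_0 = p_{1→0}/(p_{0→1} + p_{1→0}) and p_1 = p_{0→1}/(p_{0→1} + p_{1→0}), so that p_1/p_0 = p_{0→1}/p_{1→0}. The function name and the transition probabilities in the first example are hypothetical; the last line uses the unconditional probabilities for state routes as legible in Table 3.

```python
def stationary_probs(p01, p10):
    """Stationary probabilities (p0, p1) of a two-state Markov chain.

    p01 is the probability of a jump 0 -> 1, p10 of a jump 1 -> 0.
    """
    total = p01 + p10
    return p10 / total, p01 / total

# Hypothetical transition probabilities.
p0, p1 = stationary_probs(0.2, 0.3)
print(p0, p1, p1 / p0)

# Ratio p1/p0 for 1-vehicle accidents on state routes (Table 3).
print(round(0.426 / 0.574, 2))  # 0.74
```

The second printed value reproduces the ratio of 0.74 quoted in the text for state routes.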



Finally, we set model parameters (β-s) to their posterior means, calculate the
probabilities of fatality and injury outcomes by using the multinomial logit
outcome-probability equation, and average these probabilities over all values
of the explanatory variables X_{t,n} observed in the data sample. We compare
these probabilities across the two states of roadway safety, s_t = 0 and
s_t = 1, for MSML models [refer to lines "⟨P^(i)_{t,n}⟩_X" in Tables 1-3 and to
Malyshkina (2008)]. We find that in many cases these averaged probabilities of
fatality and injury outcomes do not differ very significantly across the two
states of roadway safety (the only significant differences are for fatality
probabilities in the cases of 1-vehicle accidents on US routes, county roads
and streets). This means that in many cases states s_t = 0 and s_t = 1 are
approximately equally dangerous as far as accident severity is concerned. We
discuss this result in the next section.
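The averaging of outcome probabilities can be illustrated with a minimal multinomial logit calculation. The sketch treats property damage only (PDO) as the base outcome with all coefficients fixed at zero, which is inferred from the fact that the tables report coefficients only for fatality and injury. The coefficient values and explanatory variables below are invented for illustration and are not estimates from this study; in the MSML case the same calculation would be repeated with the coefficients of each state.

```python
import math

def outcome_probs(betas, x):
    """Multinomial logit probabilities for one observation.

    `betas` maps each outcome to its coefficient vector; the base outcome
    has an all-zero vector. `x` is the vector of explanatory variables.
    """
    scores = {k: sum(b * xi for b, xi in zip(beta, x))
              for k, beta in betas.items()}
    m = max(scores.values())                       # for numerical stability
    expo = {k: math.exp(s - m) for k, s in scores.items()}
    z = sum(expo.values())
    return {k: e / z for k, e in expo.items()}

def averaged_probs(betas, data):
    """Outcome probabilities averaged over all observations in `data`."""
    totals = {k: 0.0 for k in betas}
    for x in data:
        for k, p in outcome_probs(betas, x).items():
            totals[k] += p
    return {k: t / len(data) for k, t in totals.items()}

# Hypothetical coefficients: [intercept, summer dummy, motorcycle dummy].
betas = {
    "fatality": [-4.0, 0.0, 3.0],
    "injury": [-1.7, 0.2, 3.1],
    "PDO": [0.0, 0.0, 0.0],
}
data = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]]
print(averaged_probs(betas, data))
```

Comparing the output of `averaged_probs` for the two sets of state-specific coefficients gives the kind of across-state comparison reported in the tables.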
5 Conclusions



In this study we found that two states of roadway safety and Markov switching
multinomial logit (MSML) models exist for severity of 1-vehicle accidents
occurring on high-speed roads (interstate highways, US routes, state routes,
county roads), but not for 2-vehicle accidents on high-speed roads. One
possible explanation of this result is that 1- and 2-vehicle accidents may
differ in their nature. For example, on one hand, severity of 1-vehicle
accidents may frequently be determined by driver-related factors (speeding,
falling asleep, driving under the influence, etc.). Drivers' behavior might
exhibit a two-state pattern. In particular, drivers might be overconfident
and/or have difficulties in adjusting to bad weather conditions. On the other
hand, severity of a 2-vehicle accident might crucially depend on the actual
physics involved in the collision between the two cars (for example, head-on
and side impacts are more dangerous than rear-end collisions). As far as
slow-speed streets are concerned, in this case both 1- and 2-vehicle accidents
exhibit a two-state nature for their severity. Further studies are needed to
understand these results. In this study, the important result is that in all
cases when two states of roadway safety exist, the two-state MSML models
provide a far superior statistical fit for accident severity outcomes as
compared to the standard ML models.

We found that in many cases states s_t = 0 and s_t = 1 are approximately
equally dangerous as far as accident severity is concerned. This result holds
despite the fact that state s_t = 1 is correlated with adverse weather
conditions. A likely and simple explanation of this finding is that during bad
weather both the number of serious accidents (fatalities and injuries) and the
number of minor accidents (PDOs) increase, so that their relative fraction
stays approximately steady. In addition, most drivers are rational and they
are likely to take some precautions while driving during bad weather. From the
results presented in Paper I we know that the total number of accidents
significantly increases during adverse weather conditions. Thus, drivers'
precautions are probably not sufficient to avoid increases in accident rates
during bad weather.
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Table 4 

Summary statistics of roadway accident characteristic variables 



Variable | Description
⟨P^(i)_{t,n}⟩_X | probability of i-th severity outcome averaged over all values of explanatory variables X_{t,n}
p_{0→1} | Markov transition probability of jump from state 0 to state 1 as time t increases to t + 1
p_{1→0} | Markov transition probability of jump from state 1 to state 0 as time t increases to t + 1
p_0 and p_1 | unconditional probabilities of states 0 and 1
# free par. | total number of free model coefficients (β-s)
averaged LL | posterior average of the log-likelihood (LL)
max(LL) | for MLE it is the maximal value of LL at convergence; for Bayesian-MCMC estimation it is the maximal observed value of LL during the MCMC simulations
marginal LL | logarithm of marginal likelihood of data, ln[f(Y|M)], given model M
max(PSRF) | maximum of the potential scale reduction factors (PSRF) calculated separately for all continuous model parameters, PSRF is close to 1 for converged MCMC chains
MPSRF | multivariate PSRF calculated jointly for all parameters, close to 1 for converged MCMC chains
accept. rate | average rate of acceptance of candidate values during Metropolis-Hastings MCMC draws
# observ. | number of observations of accident severity outcomes available in the data sample
ageO | "age of the driver at fault is < 18 years" indicator variable (dummy)
ageOo | "age of the oldest driver involved in the accident is < 18 years" indicator variable
cons | "construction at the accident location" indicator variable
curve | "road is at curve" indicator variable
dark | "dark time with no street lights" indicator variable
darklamp | "dark AND street lights on" indicator variable
day | "daylight" indicator variable
dayt | "day hours: 9:00 to 17:00" indicator variable
driv | "road median is drivable" indicator variable
driver | "primary cause of the accident is driver-related" indicator variable
dry | "roadway surface is dry" indicator variable
env | "primary cause of the accident is environment-related" indicator variable
fog | "fog OR smoke OR smog" indicator variable
hl10 | "help arrived in 10 minutes or less after the crash" indicator variable
hl20 | "help arrived in 20 minutes or less after the crash" indicator variable
Ind | "license state of the vehicle at fault is Indiana" indicator variable
intercept | "constant term (intercept)" quantitative variable
jobend | "after work hours: from 16:00 to 19:00" indicator variable
light | "daylight OR street lights are lit up if dark" indicator variable
maxpass | "the largest number of occupants in all vehicles involved" quantitative variable
mm | "two male drivers are involved" indicator variable (used only if a 2-vehicle accident)
morn | "morning hours: 5:00 to 9:00" indicator variable
moto | "the vehicle at fault is a motorcycle" indicator variable
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Table 4 



(Continued) 



Variable | Description
nigh | "late night hours: 1:00 to 5:00" indicator variable
nocons | "no construction at the accident location" indicator variable
nojun | "no road junction at the accident location" indicator variable
nonroad | "non-roadway crash (parking lot, etc.)" indicator variable
nosig | "no traffic control device for the vehicle at fault" indicator variable
olddrv | "the driver at fault is older than the other driver" indicator variable (used only if a 2-vehicle accident)
oldvage | "age (in years) of the oldest vehicle involved" indicator variable
othUS | "license state of the vehicle at fault is a U.S. state except Indiana and its neighboring states (IL, KY, OH, MI)" indicator variable
precip | "precipitation: rain OR snow OR sleet OR hail OR freezing rain" indicator variable
priv | "road traveled by the vehicle at fault is a private drive" indicator variable
r21 | "road traveled by the vehicle at fault is two-lane AND one-way" indicator variable
rmd2 | "road traveled by the vehicle at fault is multi-lane AND divided two-way" indicator variable
singSUV | "one of the two vehicles involved is a pickup OR a van OR a sport utility vehicle" indicator variable (used only if a 2-vehicle accident)
singTR | "one of the two vehicles is a truck OR a tractor" indicator variable (used only if a 2-vehicle accident)
slush | "roadway surface is covered by snow/slush" indicator variable
snow | "snowing weather" indicator variable
str | "road is straight" indicator variable
sum | "summer season" indicator variable
sund | "Sunday" indicator variable
thday | "Thursday" indicator variable
vage | "age (in years) of the vehicle at fault" quantitative variable
veh | "primary cause of accident is vehicle-related" indicator variable
voldg | "the vehicle at fault is more than 7 years old" indicator variable
voldo | "age of the oldest vehicle involved is more than 7 years" indicator variable
wall | "road median is a wall" indicator variable
way4 | "accident location is at a 4-way intersection" indicator variable
wint | "winter season" indicator variable
X12 | "road type" indicator variable (1 if urban, 0 if rural)
X27 | "number of occupants in the vehicle at fault" quantitative variable
X29 | "speed limit" quantitative variable (used if known and the same for all vehicles involved)
X33 | "at least one of the vehicles involved was on fire" indicator variable
X34 | "age (in years) of the driver at fault" quantitative variable
 | "gender of the driver at fault" indicator variable (1 if female, 0 if male)
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Table 5 



Correlations of the posterior probabilities P(s_t = 1|Y) with each other and with
weather-condition variables (for the MSML model)

 | 1-vehicle, interstates | 1-vehicle, US routes | 1-vehicle, state routes | 1-vehicle, county roads | 1-vehicle, streets | 2-vehicle, streets
1-vehicle, interstates | 1 | 0.418 | 0.293 | 0.606 | -0.013 | -0.173
1-vehicle, US routes | 0.418 | 1 | 0.263 | 0.688 | -0.070 | -0.155
1-vehicle, state routes | 0.293 | 0.263 | 1 | 0.409 | -0.047 | -0.035
1-vehicle, county roads | 0.606 | 0.688 | 0.409 | 1 | -0.022 | -0.051
1-vehicle, streets | -0.013 | -0.070 | -0.047 | -0.022 | 1 | 0.115
2-vehicle, streets | -0.173 | -0.155 | -0.035 | -0.051 | 0.115 | 1

All year
Precipitation (inch) | -0.139 | -0.060 | 0.096 | -0.037 | 0.067 | 0.146
Temperature (°F) | -0.606 | -0.439 | -0.234 | -0.665 | 0.231 | 0.220
Snowfall (inch) | 0.479 | 0.635 | 0.319 | 0.723 | 0.003 | -0.100
Snowfall > 0.0 (dummy) | 0.695 | 0.412 | 0.382 | 0.695 | -0.142 | -0.131
Snowfall > 0.1 (dummy) | 0.532 | 0.585 | 0.328 | 0.847 | -0.046 | -0.161
Wind gust (mph) | 0.108 | 0.100 | 0.087 | 0.206 | 0.164 | 0.051
Fog / Frost (dummy) | 0.093 | 0.164 | 0.193 | 0.167 | 0.047 | 0.119
Visibility distance (mile) | -0.228 | -0.221 | -0.172 | -0.298 | -0.019 | -0.081

Winter (November - March)
Precipitation (inch) | -0.134 | -0.037 | 0.027 | -0.053 | 0.065 | 0.356
Temperature (°F) | -0.595 | -0.479 | -0.397 | -0.735 | -0.008 | 0.236
Snowfall (inch) | 0.439 | 0.592 | 0.375 | 0.645 | 0.157 | -0.110
Snowfall > 0.0 (dummy) | 0.596 | 0.282 | 0.475 | 0.607 | 0.115 | -0.142
Snowfall > 0.1 (dummy) | 0.445 | 0.518 | 0.370 | 0.789 | 0.112 | -0.210
Wind gust (mph) | 0.302 | 0.134 | 0.122 | 0.353 | 0.237 | 0.071
Frost (dummy) | 0.537 | 0.544 | 0.440 | 0.716 | 0.052 | -0.225
Visibility distance (mile) | -0.251 | -0.304 | -0.249 | -0.380 | -0.155 | -0.109

Summer (May - September)
Precipitation (inch) | 0.000 | 0.006 | 0.259 | 0.096 | 0.047 | -0.063
Temperature (°F) | 0.179 | 0.149 | 0.113 | 0.037 | 0.062 | 0.155
Snowfall (inch) | | | | | |
Snowfall > 0.0 (dummy) | | | | | |
Snowfall > 0.1 (dummy) | | | | | |
Wind gust (mph) | -0.126 | -0.009 | 0.164 | 0.029 | 0.121 | 0.034
Fog (dummy) | 0.203 | 0.193 | 0.275 | 0.101 | -0.076 | -0.011
Visibility distance (mile) | -0.139 | -0.124 | -0.062 | -0.009 | 0.077 | -0.094
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